[ARM32] Eliminate red zone usage in runtime stubs #129398

Merged: jkotas merged 16 commits into dotnet:main from cshung:feature-avoid-red-zone on Jun 19, 2026

@cshung cshung commented Jun 14, 2026

Contributor

On ARM32 Linux, the area below SP is not guaranteed to be preserved across signal delivery. The runtime previously used the red zone (writing below SP without adjusting it) in several stubs, which can cause silent corruption or crashes when a signal is delivered at the wrong moment.

This PR eliminates all red zone usage in ARM32 runtime stubs by replacing sub-SP reads/writes with explicit stack adjustments (push/pop):

  • NativeAOT interop thunks (ThunksMapping.cpp) — use ldr pc dispatch directly from r12, no stack intermediate. This also shrinks THUNK_SIZE from 20 to 12 bytes.
  • NativeAOT UniversalTransition — caller pushes args onto stack before branching; prolog reads them from known stack offsets after saving argument registers.
  • NativeAOT interface dispatch (DispatchResolve.S, StubDispatch.S) — PROLOG_PUSH/EPILOG_POP instead of red zone stores.
  • CoreCLR VTableCallStub — pre-indexed str / post-indexed ldr (actual push/pop).
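
To make the hazard concrete, here is a toy C++ model of a descending stack. It is only a sketch: `ToyStack`, `deliver_signal`, the frame size, and the `0xDEADBEEF` fill pattern are all hypothetical and do not come from the runtime or the kernel. It contrasts a store below SP with no writeback against the pre-indexed store / post-indexed load pair mentioned above.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Toy model of a descending stack: `sp` indexes the lowest live word.
// Anything at an index below `sp` is free for an asynchronous event to reuse.
struct ToyStack {
    std::array<std::uint32_t, 64> mem{};
    std::size_t sp = 64;  // empty stack: sp is one past the highest word

    // Models `str rX, [sp, #-4]!` (pre-indexed store with writeback, a push).
    void push(std::uint32_t value) { mem[--sp] = value; }

    // Models `ldr rX, [sp], #4` (post-indexed load, a pop).
    std::uint32_t pop() { return mem[sp++]; }

    // Models signal delivery: a frame is built directly below sp.
    // Frame size and fill pattern are arbitrary choices for this sketch.
    void deliver_signal(std::size_t frame_words = 8) {
        for (std::size_t i = 1; i <= frame_words && i <= sp; ++i) {
            mem[sp - i] = 0xDEADBEEF;
        }
    }
};

// Stores below sp without moving it, takes a "signal", then reads back.
std::uint32_t spill_below_sp(std::uint32_t value) {
    ToyStack s;
    s.mem[s.sp - 1] = value;  // like `str rX, [sp, #-4]` (no writeback)
    s.deliver_signal();
    return s.mem[s.sp - 1];   // like `ldr rX, [sp, #-4]`
}

// Pushes first, takes a "signal", then pops.
std::uint32_t spill_with_push(std::uint32_t value) {
    ToyStack s;
    s.push(value);
    s.deliver_signal();
    return s.pop();
}
```

In this model `spill_below_sp(42)` returns the fill pattern because the simulated frame overwrites the slot, while `spill_with_push(42)` returns 42 because the slot sits at or above SP when the frame is built.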

On ARM32 Linux, the area below SP is not guaranteed to be preserved
across signal delivery. Replace red zone reads/writes with explicit
stack adjustments (push/pop) in:

- NativeAOT interop thunks (ldr pc dispatch, no stack intermediate)
- NativeAOT UniversalTransition (caller pushes args onto stack)
- NativeAOT interface dispatch stubs (PROLOG_STACK_ALLOC instead of
  sub-SP stores)
- CoreCLR VTableCallStub (pre-indexed str/post-indexed ldr)

Guarded by FEATURE_AVOID_RED_ZONE, enabled for ARM32 non-Windows
targets.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@cshung cshung requested a review from MichalStrehovsky as a code owner June 14, 2026 22:53
@dotnet-policy-service (bot) added the community-contribution label (indicates that the PR has been added by a community member) Jun 14, 2026
@dotnet-policy-service

Contributor

Tagging subscribers to this area: @agocke, @dotnet/ilc-contrib
See info in area-owners.md if you want to be subscribed.

Comment thread src/coreclr/vm/CMakeLists.txt Outdated
@MichalPetryka

Contributor

> Windows ARM32 has a well-defined red zone guarantee

Windows ARM32 is no longer supported.

Windows ARM32 is no longer supported, so every ARM32 target is Linux.
The red zone avoidance is always needed — remove the preprocessor guard
and delete the old red zone code paths entirely.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Comment thread src/coreclr/nativeaot/Runtime/ThunksMapping.cpp Outdated
Comment thread src/coreclr/nativeaot/Runtime/ThunksMapping.cpp Outdated
Comment thread src/coreclr/vm/arm/virtualcallstubcpu.hpp Outdated
Comment thread src/coreclr/vm/arm/virtualcallstubcpu.hpp Outdated
@cshung cshung force-pushed the feature-avoid-red-zone branch 2 times, most recently from 7cc9b73 to 59bc77c Compare June 15, 2026 17:31
Comment thread src/coreclr/nativeaot/Runtime/arm/UniversalTransition.S Outdated
Comment thread src/coreclr/runtime/arm/StubDispatch.S Outdated
Comment thread src/coreclr/runtime/arm/StubDispatch.S Outdated
Comment thread src/coreclr/runtime/arm/StubDispatch.S Outdated
Comment thread src/coreclr/runtime/arm/StubDispatch.S Outdated
Comment thread src/coreclr/nativeaot/Runtime/arm/UniversalTransition.S Outdated
cshung and others added 2 commits June 15, 2026 18:16
The ldr pc dispatch needs only 12 bytes (mov r12 + ldr pc), no padding
required. This increases thunks per page from 204 to 341 (67% more).

Also shorten verbose comments per review feedback.

Co-authored-by: Jan Kotas <jkotas@microsoft.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
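
The thunks-per-page figures in the commit message above can be checked with a little arithmetic. The sketch below assumes a 4096-byte page, which the thread does not state but which matches the quoted 204 and 341; the names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// Assumption: a 4096-byte page. The page size is not given in the thread.
constexpr std::size_t kAssumedPageSize = 4096;

// Number of whole thunks of `thunk_size` bytes that fit in one page.
constexpr std::size_t thunks_per_page(std::size_t thunk_size) {
    return kAssumedPageSize / thunk_size;
}

// Percentage increase from `before` to `after`, truncated to an integer.
constexpr std::size_t percent_increase(std::size_t before, std::size_t after) {
    return (after - before) * 100 / before;
}

static_assert(thunks_per_page(20) == 204, "20-byte thunks per assumed page");
static_assert(thunks_per_page(12) == 341, "12-byte thunks per assumed page");
static_assert(percent_increase(204, 341) == 67, "increase quoted in the commit");
```

Under that assumption, 4096 / 20 truncates to 204 and 4096 / 12 truncates to 341, and 137 / 204 is about 67%.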
- StubDispatch: use PROLOG_PUSH/EPILOG_POP {r1,r2} instead of manual
  STACK_ALLOC + str/ldr
- UniversalTransition: replace interleaved ldr/push dance with a single
  PROLOG_PUSH {r0-r3} then load caller args from known stack offsets
- Clean up stale red zone comments

Co-authored-by: Jan Kotas <jkotas@microsoft.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@cshung cshung force-pushed the feature-avoid-red-zone branch from 59bc77c to 87288df Compare June 15, 2026 18:27
Comment thread src/coreclr/nativeaot/Runtime/arm/UniversalTransition.S
Comment thread src/coreclr/runtime/arm/StubDispatch.S Outdated
Comment thread src/coreclr/runtime/arm/StubDispatch.S Outdated
Comment thread src/coreclr/runtime/arm/StubDispatch.S
Comment thread src/coreclr/runtime/arm/StubDispatch.S
Comment thread src/coreclr/runtime/arm/StubDispatch.S
Co-authored-by: Jan Kotas <jkotas@microsoft.com>
@jkotas

jkotas commented Jun 15, 2026

Member

/azp run runtime-nativeaot-outerloop

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@jkotas jkotas left a comment

Member

Thanks!

@jkotas

jkotas commented Jun 16, 2026

Member

Segfaults in many linux arm32 NAOT tests

pushd .
chmod +rwx Microsoft.Extensions.Configuration.FileExtensions.Tests ^&^& ./Microsoft.Extensions.Configuration.FileExtensions.Tests -notrait category=IgnoreForCI -notrait category=OuterLoop -notrait category=failing -xml testResults.xml 
popd
===========================================================================================================
/root/helix/work/workitem/e /root/helix/work/workitem/e
DOTNET_DbgEnableMiniDump is set and the createdump binary does not exist: ./createdump
./RunTests.sh: line 173:    18 Segmentation fault      (core dumped) ./Microsoft.Extensions.Configuration.FileExtensions.Tests -notrait category=IgnoreForCI -notrait category=OuterLoop -notrait category=failing -xml testResults.xml $RSP_FILE
/root/helix/work/workitem/e
----- end Mon Jun 15 09:53:04 PM UTC 2026 ----- exit code 139 ----------------------------------------------------------

Could you please take a look?

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@jkotas

jkotas commented Jun 17, 2026

Member

@MichalStrehovsky PTAL

Copilot AI left a comment

Contributor

Pull request overview

This PR updates several ARM32 stubs and NativeAOT transitions to avoid writing below sp (red zone) by switching to explicit stack adjustments (push/pop / stack alloc), and updates related thunk/transition conventions accordingly.

Changes:

  • CoreCLR ARM32 interface/vtable-related stubs: replace red-zone saves/restores with stack-based sequences.
  • NativeAOT ARM32 thunk and interop paths: shrink thunk stubs by branching via ldr pc while preserving r12 as the thunk data pointer, and adjust RhCommonStub accordingly.
  • NativeAOT ARM32 universal transition: change extra-argument passing to caller-pushed stack args and update the corresponding stack frame layout and unwind helper logic.

Reviewed changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| src/coreclr/vm/arm/virtualcallstubcpu.hpp | Updates VTableCall stub encoding/size logic to use push/pop-style stack ops instead of red zone. |
| src/coreclr/runtime/arm/StubDispatch.S | Replaces red-zone register spills in cached interface dispatch stubs and adjusts slow-path arg passing to universal transition. |
| src/coreclr/nativeaot/Runtime/ThunksMapping.cpp | Changes ARM thunk stub shape and size to branch via ldr pc and keep r12 as data pointer. |
| src/coreclr/nativeaot/Runtime/StackFrameIterator.cpp | Updates ARM universal transition stack frame layout to account for caller-pushed extra args. |
| src/coreclr/nativeaot/Runtime/EHHelpers.cpp | Adjusts ARM unwind helper to compensate for new interface dispatch stack usage on null-this AV. |
| src/coreclr/nativeaot/Runtime/arm/UniversalTransition.S | Switches universal transition extra args from red zone to caller-pushed stack args and updates prolog/epilog accordingly. |
| src/coreclr/nativeaot/Runtime/arm/InteropThunksHelpers.S | Updates RhCommonStub to consume r12 directly (no red-zone load). |
| src/coreclr/nativeaot/Runtime/arm/DispatchResolve.S | Replaces red-zone spills with stack pushes and updates slow-path argument setup for universal transition. |

Comment thread src/coreclr/nativeaot/Runtime/arm/DispatchResolve.S Outdated
Comment thread src/coreclr/nativeaot/Runtime/ThunksMapping.cpp
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
@jkotas

jkotas commented Jun 17, 2026

Member

/azp run runtime-nativeaot-outerloop

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

Comment thread src/coreclr/nativeaot/Runtime/arm/DispatchResolve.S
@jkotas

jkotas commented Jun 17, 2026

Member

/azp run runtime-nativeaot-outerloop

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@jkotas

jkotas commented Jun 18, 2026

Member

All Arm32 test failures are known

Comment thread src/coreclr/nativeaot/Runtime/arm/DispatchResolve.S
Comment thread src/coreclr/nativeaot/Runtime/EHHelpers.cpp
Comment thread src/coreclr/nativeaot/Runtime/StackFrameIterator.cpp Outdated
jkotas and others added 2 commits June 17, 2026 21:23
Co-authored-by: Michal Strehovský <MichalStrehovsky@users.noreply.github.com>
…USH/EPILOG_POP

- DispatchResolve.S: use PROLOG_PUSH/EPILOG_POP for {r3,r4,r5,r6,r8}, add
  .save {r1,r2} at Hashtable entry, drop lr from push list
- UniversalTransition.S: rewrite prolog to preserve original frame layout
  (push r0-r1, capture caller args, store r2-r3 into caller slots)
- StackFrameIterator.cpp: revert to original UniversalTransitionStackFrame
  layout (no m_callerPushedArgs)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
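
The prolog change described in this commit message (push r0-r1, capture the caller-pushed args, store r2-r3 into the caller slots) can be pictured with a small C++ model. This is a hedged sketch, not the code in UniversalTransition.S: the `Regs` and `PrologResult` types, the `run_prolog` function, and the order of the two caller-pushed words are assumptions made for illustration.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical snapshot of the four argument registers on entry.
struct Regs { std::uint32_t r0, r1, r2, r3; };

// Hypothetical result: the four words at sp..sp+12 after the prolog,
// plus the two caller-pushed values that were captured into registers.
struct PrologResult {
    std::array<std::uint32_t, 4> saved_args;
    std::uint32_t captured0;
    std::uint32_t captured1;
};

PrologResult run_prolog(const Regs& regs,
                        std::uint32_t caller_arg0,
                        std::uint32_t caller_arg1) {
    std::array<std::uint32_t, 16> mem{};  // toy descending stack
    std::size_t sp = 16;

    // Caller: push two extra words before branching (order is assumed).
    mem[--sp] = caller_arg1;
    mem[--sp] = caller_arg0;

    // Callee: push {r0, r1}; the lower register lands at the lower address.
    mem[--sp] = regs.r1;
    mem[--sp] = regs.r0;

    // Capture the caller-pushed words, now two words above sp.
    const std::uint32_t c0 = mem[sp + 2];
    const std::uint32_t c1 = mem[sp + 3];

    // Store r2 and r3 into the slots the caller's words occupied.
    mem[sp + 2] = regs.r2;
    mem[sp + 3] = regs.r3;

    PrologResult out{};
    out.saved_args = {mem[sp], mem[sp + 1], mem[sp + 2], mem[sp + 3]};
    out.captured0 = c0;
    out.captured1 = c1;
    return out;
}
```

In the model the four words at SP end up as r0, r1, r2, r3 in order, the same block a single push of r0-r3 would produce. That is consistent with the message's note that the original frame layout is kept, although the real frame contains more than these four words.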
@jkotas jkotas requested a review from MichalStrehovsky June 18, 2026 17:59
@jkotas

jkotas commented Jun 18, 2026

Member

/azp run runtime-nativeaot-outerloop

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@jkotas

jkotas commented Jun 18, 2026

Member

/azp run runtime-nativeaot-outerloop

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@jkotas jkotas merged commit 083ad8b into dotnet:main Jun 19, 2026
137 checks passed
@cshung cshung deleted the feature-avoid-red-zone branch June 19, 2026 03:40
@cshung

cshung commented Jun 19, 2026

Contributor Author

Thanks @jkotas and @MichalStrehovsky for the thorough review and guidance! The prolog trick to preserve the original frame layout was particularly elegant — avoiding the TransitionBlock.cs changes made this much cleaner. Appreciated the push toward idiomatic ARM patterns (PROLOG_PUSH/EPILOG_POP, r8 as scratch) and the thunk size reduction as a bonus. Learned a lot from this one.

@jkotas

jkotas commented Jun 19, 2026

Member

@cshung Thank you for fixing this! It is likely the source of some of the intermittent arm32 crashes. How did you find the problem?

@cshung

cshung commented Jun 19, 2026

Contributor Author

> @cshung Thank you for fixing this! It is likely the source of some of the intermittent arm32 crashes. How did you find the problem?

I am trying to get it to run on low-end devices without virtual memory support. On those platforms, code that uses the red zone fails pretty easily.

Labels

arch-arm32, area-NativeAOT-coreclr, community-contribution

5 participants